Frans Pop: The case of the self-perpetuating DNS errors
Ingredients:
- some lame DNS server
- logcheck
- spamassassin
named: connection refused resolving 'somedomain.org/NS/IN': xxx.yyy.zzz.nnn#53
named: connection refused resolving 'somedomain.org/NS/IN': xxx.yyy.zzz.mmm#53
named: connection refused resolving 'ns1.somedomain.org/AAAA/IN': xxx.yyy.zzz.mmm#53
named: connection refused resolving 'ns2.somedomain.org/AAAA/IN': xxx.yyy.zzz.mmm#53
named: connection refused resolving 'ns1.somedomain.org/AAAA/IN': xxx.yyy.zzz.nnn#53
named: connection refused resolving 'ns2.somedomain.org/AAAA/IN': xxx.yyy.zzz.nnn#53
The timing was fairly regular: once just before the hour, mostly 2 minutes
after. I fetch mail at around that time, but also at other times, so mail
fetching was a possible but unlikely cause. The 2 minutes after was the
first real clue: some cron job maybe? After I disabled logcheck, the
messages no longer appeared in the log. When I enabled it again, they were
back.
Additional confusion was caused by the fact that the domain had "debian"
in its name, but it was somewhere obscure. So why was logcheck causing
a lookup for that domain? This confused me enough to waste some time
looking for some silly (default) configuration problem in some package.
Enter spamassassin. Apparently it parsed the message body, recognized
"somedomain.org" as a host name, and proceeded to do a DNS lookup as a
validity check.
So we have the following loop, started off by something causing an initial
DNS lookup for the domain, which fails and gets logged:
- logcheck reports the failure during its next check
- spamassassin processes the logcheck mail, spots the domain name and does a new set of lookups, which fail and get logged
- logcheck reports the failures during its next check
- ...
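
The loop above can be sketched as a small simulation. This is only an
illustration: the function names, the host name pattern and the "every
lookup fails" behaviour are assumptions made for the sketch, not how
logcheck, spamassassin or named are actually implemented.

```python
import re

# Hypothetical pattern: matches anything in the text that looks like a host name.
HOSTNAME = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b")

def failing_lookup(name, log):
    """Stand-in for named: every lookup fails and leaves a log line."""
    log.append(f"named: connection refused resolving '{name}/NS/IN'")

def logcheck(log, already_reported):
    """Stand-in for logcheck: mail out the log lines added since the last run."""
    new_lines = log[already_reported:]
    return "\n".join(new_lines), len(log)

def spam_filter(mail_body, log):
    """Stand-in for spamassassin: look up every host name found in the body."""
    for name in sorted(set(HOSTNAME.findall(mail_body))):
        failing_lookup(name, log)

log = []
failing_lookup("somedomain.org", log)  # the unknown initial trigger
reported = 0
for cycle in range(1, 4):
    mail, reported = logcheck(log, reported)
    if not mail:
        print(f"cycle {cycle}: nothing to report, loop has ended")
        break
    spam_filter(mail, log)
    print(f"cycle {cycle}: {len(log) - reported} new failure(s) logged")
```

In this sketch every cycle logs a fresh failure, so the "nothing to report"
branch is never reached. Skipping the spam_filter call for logcheck mails is
what lets the log drain.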
One way to break the loop is to tell bind9 not to log these errors at all:
    logging {
        category lame-servers { null; };
    };
Anyway, now I just no longer pass logcheck mails through spamassassin.
(Although filtering out these DNS errors in bind9 can be perfectly valid.)